(57) Abstract

A method and apparatus for selecting a quantizer scale (170) to
maintain the overall quality of the video image while optimizing the
coding rate. A quantizer scale is selected for each macroblock such
that the target bit rate for the picture is achieved while an optimal
quantization scale ratio is maintained for successive macroblocks to
produce a uniform visual quality over the entire picture. One
embodiment applies the method to the frame level while another
embodiment applies the method in conjunction with a wavelet transform.

APPARATUS AND METHOD FOR OPTIMIZING THE RATE
CONTROL IN A CODING SYSTEM

This application claims the benefit of U.S. Provisional
Applications No. 60/007,014 filed October 25, 1995, No. 60/007,016 filed
October 25, 1995 and No. 60/020,872 filed June 28, 1996.

The present invention relates to an apparatus and concomitant
method for optimizing the coding of motion video. More particularly,
this invention relates to a method and apparatus that recursively adjusts
the quantizer scale for each macroblock to maintain the overall quality of
the motion video while optimizing the coding rate.

BACKGROUND OF THE INVENTION
The increasing development of digital video technology presents
an ever increasing problem of reducing the high cost of video
compression codecs (coder/decoder) and resolving the inter-operability of
equipment of different manufacturers. To achieve these goals, the
Moving Picture Experts Group (MPEG) created international standards
11172 and 13818, which are incorporated herein in their entirety by
reference.

In the area of rate control, MPEG does not define a specific
algorithm for controlling the bit rate of an encoder. It is the task of the
encoder designer to devise a rate control process for controlling the bit
rate such that the decoder input buffer neither overflows nor
underflows. Thus, it is the task of the encoder to monitor the number of
bits generated by the encoder, thereby preventing the overflow and
underflow conditions.

Currently, one way of controlling the bit rate is to alter the
quantization process, which will affect the distortion of the input video
image. By altering the quantizer scale, the bit rate can be changed and
controlled. Although changing the quantizer scale is an effective
method of implementing the rate control of an encoder, it has been
shown that a poor rate control process will actually degrade the visual
quality of the video image.
In current MPEG coding strategies, the quantizer scale for
each macroblock is selected by assuming that all the pictures of the
same type have identical complexity within a group of pictures.
However, the quantizer scale selected by this criterion may not achieve
optimal coding performance [...].

Furthermore, encoders that utilize global-type transforms have
similar problems. For example, one such global-type compression
technique appears in the Proceedings of the International Conference on
Acoustics, Speech and Signal Processing, San Francisco, Cal., March
1992, volume IV, pages 657-660, where there is disclosed a signal
compression system which applies a hierarchical subband
decomposition, or wavelet transform, followed by the hierarchical
successive approximation entropy-coded quantizer incorporating
zerotrees. The representation of signal data using a multiresolution
hierarchical subband representation was disclosed by Burt et al. in IEEE
Trans. on Commun., Vol. Com-31, No. 4, April 1983, page 533. A wavelet
pyramid, also known as critically sampled quadrature-mirror filter
(QMF) subband representation, is a specific type of multiresolution
hierarchical subband representation of an image. A wavelet pyramid
[...] Utah. A QMF subband pyramid has been
described in "Subband Image Coding", J.W. Woods ed., Kluwer
Academic Publishers, 1991 and I. Daubechies, Ten Lectures on
Wavelets, Society for Industrial and Applied Mathematics (SIAM):
Philadelphia, Pa., 1992. Furthermore, U.S. patent 5,412,741, issued in May
[...], [...]
method for encoding information with a high degree of compression.

The output bit stream from a video encoder tends to have a
variable bit rate that fluctuates according to scene contents and the
nature of the coding process used by the encoder. As such, the encoder
requires a mechanism to regulate the output bit rate to match the
channel rate with minimum loss of signal quality.

Therefore, a need exists in the art for an apparatus and method to
maintain the overall quality of the video image while optimizing the
coding rate. Similarly, encoders that utilize global-type transforms such
as wavelet transforms have special requirements that are not met by the
prior art rate control techniques.

SUMMARY OF THE INVENTION
The present invention is a method and apparatus for selecting an
optimal quantizer scale to maintain the overall quality of the video image
while optimizing the coding rate. Namely, a quantizer scale is selected
for each macroblock such that the target bit rate for the picture is achieved
while an optimal quantization scale ratio is maintained for successive
macroblocks to produce a uniform visual quality over the entire picture.
One embodiment applies the method to the frame level while another
embodiment applies the method in conjunction with a wavelet
transform.

BRIEF DESCRIPTION OF THE DRAWINGS
The teachings of the present invention can be readily understood
by considering the following detailed description in conjunction with the
accompanying drawings, in which:

FIG. 1 illustrates a block diagram of the apparatus of the present
invention;

FIG. 2 illustrates a flowchart for deriving the optimal quantizer
scale in accordance with a complexity model for controlling the bit rate
of the apparatus;

FIG. 3 illustrates a flowchart for deriving a modifier to the
quantizer scale based upon the constraint of an optimal quantization
ratio;

FIG. 4 illustrates a flowchart for a rate control method that uses
the actual data resulting from the encoding process to directly compute
the quantizer scale for the next macroblock;

FIG. 5 illustrates a flowchart for calculating the projected number
of bits T_P(n) for the n-th frame;

FIG. 6 depicts a block diagram of a wavelet-based encoder
incorporating the present invention;

FIG. 7 is a graphical representation of a wavelet tree;

FIG. [...] is a block diagram of a rate controller for [...]
rate of a quantizer [...].

To facilitate understanding, identical reference numerals have
been used, where possible, to designate identical elements that are
common to the figures.

DETAILED DESCRIPTION

FIG. 1 depicts a block diagram of the apparatus 100 of the present
invention for deriving a quantizer scale for each macroblock to maintain
the overall quality of the video image while controlling the coding rate.
In the preferred embodiment of the present invention, the apparatus 100
is an encoder or a portion of a more complex block-based motion
compensation coding system. The apparatus 100 comprises a motion
estimation module 140, a motion compensation module 150, a rate
control module 130, a DCT module 160, a quantization module 170, a
variable length coding (VLC) module 180, a buffer 190, an inverse
quantization module 175, an inverse DCT module 165, a
subtractor 115 and a summer 155. Although the apparatus 100
comprises a plurality of modules, those skilled in the art will realize that
the functions performed by the various modules are not required to be
isolated into separate modules as shown in FIG. 1. For example, the set
of modules comprising the motion compensation module 150, inverse
quantization module 175 and inverse DCT module 165 is generally
known as an "embedded decoder".

FIG. 1 illustrates an input video image (image sequence) 110
which is digitized and represented as a luminance and two color
difference signals (Y, C_r, C_b) in accordance with the MPEG standards.
These signals are further divided into a plurality of layers (sequence,
group of pictures, picture, slice, macroblock and block) such that each
picture (frame) is represented by a plurality of macroblocks. Each
macroblock comprises four (4) luminance blocks, one C_r block and one C_b
block where a block is defined as an eight (8) by eight (8) sample array.
The division of a picture into block units improves the ability to discern
changes between two successive pictures and improves image
compression through the elimination of low amplitude transformed
coefficients (discussed below). The digitized signal may optionally
undergo preprocessing such as format conversion for selecting an
appropriate window, resolution and input format.

The input video image on path 110 is received into motion
estimation module 140 for estimating motion vectors. A motion vector is
a two-dimensional vector which is used by motion compensation to
provide an offset from the coordinate position of a block in the current
picture to the coordinates in a reference frame. Because of the high
redundancy that exists between the consecutive frames of a video image
sequence, a current frame can be reconstructed from a reference frame
and the difference between the current and reference frames by using
the motion information (motion vectors). The reference frames can be a
previous frame (P-frame), or previous and/or future frames (B-frames).
The use of motion vectors greatly enhances image compression by
reducing the amount of information that is transmitted on a channel
because only the changes between the current and reference frames are
coded and transmitted. Various methods are currently available to an
encoder designer for implementing motion estimation.

The motion vectors from the motion estimation module 140 are
received by the motion compensation module 150 for improving the
efficiency of the prediction of sample values. Motion compensation
involves a prediction that uses motion vectors to provide offsets into the
past and/or future reference frames containing previously decoded
sample values that are used to form the prediction error. Namely, the
motion compensation module 150 uses the previously decoded frame and
the motion vectors to construct an estimate of the current frame.
Furthermore, those skilled in the art will realize that the functions
performed by the motion estimation module and the motion
compensation module can be implemented in a combined module, e.g., a
single block motion compensator.

Furthermore, prior to performing motion compensation
prediction for a given macroblock, a coding mode must be selected. In
the area of coding mode decision, MPEG provides a plurality of different
macroblock coding modes. Generally, these coding modes are grouped
into two broad classifications, inter mode coding and intra mode coding.
Intra mode coding involves the coding of a macroblock or picture that
uses information only from that macroblock or picture. Conversely,
inter mode coding involves the coding of a macroblock or picture that
uses information both from itself and from macroblocks and pictures
occurring at different times. Specifically, MPEG-2 provides macroblock
coding modes which include [...]
frame/field/dual-prime motion compensation inter mode,
forward/backward/average inter mode and field/frame DCT mode. The
proper selection of a coding mode for each macroblock will improve
coding performance. Again, various methods are currently available to
an encoder designer for implementing coding mode decision.

Once a coding mode is selected, motion compensation module 150
generates a motion compensated prediction (predicted image) on path
152 of the contents of the block based on past and/or future reference
pictures. This motion compensated prediction on path 152 is subtracted
via subtractor 115 from the video image on path 110 in the current
macroblock to form an error signal or predictive residual signal on path
153. The formation of the predictive residual signal effectively removes
redundant information in the input video image. Namely, instead of
transmitting the actual video image via a transmission channel, only
the information necessary to generate the predictions of the video image
and the errors of these predictions are transmitted, thereby significantly
reducing the amount of data needed to be transmitted. To further reduce
the bit rate, the predictive residual signal on path 153 is passed to the DCT
module 160 for encoding.

The DCT module 160 then applies a forward discrete cosine
transform process to each block of the predictive residual signal to
produce an eight (8) by eight (8) block of DCT coefficients. The
discrete cosine transform is an invertible, discrete orthogonal
transformation where the DCT coefficients represent the amplitudes of a
set of cosine basis functions. One advantage of the discrete cosine
transform is that the DCT coefficients are uncorrelated. This
decorrelation of the DCT coefficients is important for compression,
because each coefficient can be treated independently without the loss of
compression efficiency. Furthermore, the DCT basis function or
subband decomposition permits effective use of psychovisual criteria,
which is important for the next step of quantization.

The resulting 8x8 block of DCT coefficients is received by 
quantization module 170 where the DCT coefficients are quantized. The 
process of quantization reduces the accuracy with which the DCT 
coefficients are represented by dividing the DCT coefficients by a set of 
quantization values with appropriate rounding to form integer values. 
The quantization values can be set individually for each DCT coefficient, 
using criteria based on the visibility of the basis functions (known as 
visually weighted quantization). Namely, the quantization value 
corresponds to the threshold for visibility of a given basis function, i.e., 
the coefficient amplitude that is just detectable by the human eye. By 
quantizing the DCT coefficients with this value, many of the DCT 
coefficients are converted to the value "zero", thereby improving image 
compression efficiency. The process of quantization is a key operation 
and is an important tool to achieve visual quality and to control the 
encoder to match its output to a given bit rate (rate control). Since a 
different quantization value can be applied to each DCT coefficient, a
"quantization matrix" is generally established as a reference table, e.g., 
a luminance quantization table or a chrominance quantization table. 
Thus, the encoder chooses a quantization matrix that determines how 
each frequency coefficient in the transformed block is quantized. 
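
The quantization step described above can be sketched in a few lines. This is a simplified illustration only: the function names, the small example matrix and the way the quantizer scale multiplies the matrix entry are assumptions made for this sketch, and the exact MPEG arithmetic (intra DC handling, normalization constants, mismatch control) is deliberately left out.

```python
def quantize_block(coeffs, quant_matrix, quantizer_scale):
    """Divide each DCT coefficient by its step size and round to an integer.

    Assumption for this sketch: step size = matrix entry * quantizer scale.
    """
    return [
        [int(round(c / (q * quantizer_scale))) for c, q in zip(c_row, q_row)]
        for c_row, q_row in zip(coeffs, quant_matrix)
    ]


def dequantize_block(levels, quant_matrix, quantizer_scale):
    """Approximate inverse: multiply each level by the same step size."""
    return [
        [lvl * q * quantizer_scale for lvl, q in zip(l_row, q_row)]
        for l_row, q_row in zip(levels, quant_matrix)
    ]


if __name__ == "__main__":
    # A 2 x 2 corner of a block is enough to show the effect.
    coeffs = [[100.0, 7.0], [-33.0, 2.0]]
    matrix = [[8, 16], [16, 32]]
    levels = quantize_block(coeffs, matrix, quantizer_scale=2)
    print(levels)                               # small coefficients become 0
    print(dequantize_block(levels, matrix, 2))  # reconstruction differs from the input
```

A larger quantizer scale enlarges every step size, so more coefficients collapse to zero and fewer bits are produced, at the cost of a larger reconstruction error. This is the lever that the rate control module adjusts.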
[...] quantization matrix depends on many [...]
characteristics of the intended display [...] and the
amount of noise in the source. [...] The proper
selection of a quantizer scale is performed by the rate control module 130.

Next, [...]
sequential ordering of the DCT coefficients from the lowest spatial
frequency to the highest. Since quantization generally reduces DCT
coefficients of high spatial frequencies to zero, the [...]
integers [...].

Variable length coding (VLC) module 180 then encodes the string
of quantized DCT coefficients and all side-information for the
macroblock such as macroblock type and motion vectors. The VLC
module 180 utilizes variable length coding and run-length coding to
efficiently improve coding efficiency. Variable length coding is a
reversible coding process where shorter code-words are assigned to
frequent events and longer code-words are assigned to less frequent
events, while run-length coding increases coding efficiency by encoding
a run of symbols with a single symbol. These coding schemes are well
known in the art and are often referred to as Huffman coding when
integer-length code words are used. Thus, the VLC module 180
performs the final step of converting the input video image into a valid
data stream. Those skilled in the art will realize that the VLC module
can be replaced with other types of entropy coders.
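
The ordering and run-length ideas in the passage above can be illustrated with a short sketch. It assumes the conventional zig-zag scan used for frame-coded MPEG blocks, and it stops at (run, level) pairs; the function names and the "EOB" marker are hypothetical, and the actual MPEG code-word tables are not reproduced.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zig-zag scan order."""
    def key(rc):
        r, c = rc
        d = r + c
        # Odd anti-diagonals are walked with the row increasing,
        # even anti-diagonals with the column increasing.
        return (d, r if d % 2 else c)
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)


def run_length_pairs(block):
    """Convert a quantized block into (zero_run, level) pairs plus an end marker."""
    pairs, run = [], 0
    for r, c in zigzag_order(len(block)):
        value = block[r][c]
        if value == 0:
            run += 1          # extend the current run of zeros
        else:
            pairs.append((run, value))
            run = 0
    pairs.append("EOB")       # trailing zeros are implied by the end-of-block marker
    return pairs


if __name__ == "__main__":
    block = [[6, 0, 0, 0], [-1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
    print(run_length_pairs(block))
```

Because quantization tends to zero the high-frequency coefficients, the scan ends in a long run of zeros that costs only the end marker. An entropy coder would then assign short code words to the most frequent pairs.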

The data stream is received into a "First In-First Out" (FIFO) 
buffer 190. A consequence of using different picture types and variable 
length coding is that the overall bit rate into the FIFO is variable. 
Namely, the number of bits used to code each frame can be different. In 
applications that involve a fixed-rate channel, a FIFO buffer is used to 
match the encoder output to the channel for smoothing the bit rate. 
Thus, the output signal of FIFO buffer 190 is a compressed 
representation of the input video image on path 110, where it is sent to a 
storage medium or telecommunication channel via path 295. 

The rate control module 130 serves to monitor and adjust the bit 
rate of the data stream entering the FIFO buffer 190 to prevent overflow
and underflow on the decoder side (within a receiver or target storage 
device, not shown) after transmission of the data stream. Thus, it is the 
task of the rate control module 130 to monitor the status of buffer 190 to 
control the number of bits generated by the encoder. 

In the preferred embodiment of the present invention, rate control 
module 130 selects a quantizer scale for each macroblock to maintain the 
overall quality of the video image while controlling the coding rate. 
Namely, a quantizer scale is selected for each macroblock such that
the target bit rate for the picture is achieved while an optimal quantization
scale ratio is maintained for successive macroblocks to produce a 
uniform visual quality over the entire picture. 

Specifically, the rate control module 130 initially obtains a rough 
estimate of the complexity of a specific type of picture (I, P, B) from 
previously encoded pictures or by implementing the TM4 and TM5 
methods. This estimated complexity is used to derive a predicted 
number of bits necessary to code each macroblock. With this knowledge, 
a quantizer scale is calculated for the macroblock in accordance with a 
complexity model having a polynomial form. This complexity model is 
derived to meet the constraint that the selected quantizer scales for the 
macroblocks should approach the target bit rate for the picture. 
Furthermore, the quantizer scale is optionally refined by a
modifier which is derived to meet a constraint that requires a constant
visual quality to be maintained for the entire picture. Namely, the
constraint requires an optimal quantization scale ratio to be maintained
for successive macroblocks. The rate control module applies the
modifier to the quantizer scale to produce an optimal quantizer scale
which is used to code the macroblock. Once the macroblock is encoded,
the rate control module recursively adjusts the complexity model
through the use of a polynomial regression process. That is, the actual
number of bits necessary to code the macroblock is used to refine the
complexity model so as to improve the prediction of a quantizer scale for
the next macroblock. A detailed description of the quantizer scale
selection method is discussed below with reference to FIG. 2 and FIG. 3.

Returning to FIG. 1, the resulting 8 x 8 block of quantized DCT
coefficients from the quantization module 170 is also received by the
inverse quantization module 175 via signal connection 172. At this stage,
the encoder regenerates I-frames and P-frames of the input video image
by decoding the data so that they are used as reference frames for
subsequent encoding. The inverse quantization module 175 starts the
decoding process by dequantizing the quantized DCT coefficients.
Namely, the quantized DCT coefficients are multiplied by a set of
quantization values with appropriate rounding to produce integer
values.

The resulting dequantized 8x8 block of DCT coefficients is
passed to the inverse DCT module 165 where inverse DCT is applied to
each macroblock to produce the decoded error signal. This error signal
is added back to the prediction signal from the motion compensation
module via summer 155 to produce a decoded reference picture
(reconstructed image). Generally, if an I-frame or a P-frame is decoded,
it will be stored, replacing the oldest stored reference. Thus, an
apparatus 100 for selecting a quantizer scale for each macroblock to
maintain the overall quality of the video image while optimizing the
coding rate is disclosed.
FIG. 2 depicts a flowchart for deriving the optimal quantizer scale
in accordance with a complexity model for controlling the bit rate of the 
apparatus in the preferred embodiment of the present invention. To 
develop the preferred embodiment of the present invention, an 
optimization problem was formulated for the selection of the quantizer 
scale. The solution is based upon the rate-distortion characteristics or 
R(D) curves for all the macroblocks that compose the picture being 
coded. Based upon the results, a method for selecting the quantizer 
scale for each macroblock with less complexity for practical 
implementation is presented. 

The first constraint for the optimal solution is:

$$T = \sum_{i=1}^{N} R_i \qquad (1)$$

which states that the target bit rate for a picture, $T$, is measured as an
accumulation of the bits allocated to the individual macroblocks, $R_i$, for all $N$,
the total number of macroblocks in the picture.

The second constraint for the optimal solution is:

$$Q_1 \times k_1 = \cdots = Q_N \times k_N \qquad (2)$$

which states that the product for the macroblock i of the quantizer scale,
$Q_i$, and a human visual system weighting, $k_i$, should be equal to the
product of Q and k for any other macroblock in the picture to maintain a
constant visual quality. In effect, there exists a set of optimal
quantization scale ratios $k_1', \ldots, k_N'$ so that the whole picture has equal
overall quality, which can be alternatively expressed as:

$$\frac{Q_1}{k_1'} = \cdots = \frac{Q_N}{k_N'} \qquad (3)$$

where $k_1' = 1/k_1, \ldots, k_N' = 1/k_N$.

The third constraint for the optimal solution is:

$$R_i \times Q_i = X_i \qquad (4)$$

which states that the complexity measure, $X_i$, for the macroblock i is a
function of a metric $v_i$, or is described in terms of the product of the bit
rate and the quantizer scale of the macroblock. [...]
layer. In the preferred embodiment, the metric $v_i$ is the variance
computed over the pixels in the macroblock i.

The method 200 of the present invention as depicted in FIG. 2 is
formulated to derive a quantizer scale for each macroblock that meets
the above constraints. The solution should reach the target bit rate
while maintaining the relative ratios of all the quantizer scales so that
the visual quality is uniform within one picture or frame.

Referring to FIG. 2, the method begins at step 205 and proceeds to
step 210 where the method adopts an initial model having the
relationship of $R_i = X_i / Q_i$ (equation 4) to predict the $R_i$ bits allocated to
code the current macroblock i. This initial model acquires an initial
prediction of the complexity $X_I$, $X_P$ and $X_B$ for each type of picture I, P and
B through methods such as TM4 and TM5. The complexity for each type of picture
is derived from the number of bits generated by encoding each picture
and an average of the quantizer scales used to code the macroblocks in
the picture. Since the initial model assumes that pictures of similar type
have similar complexity, $R_i$ can be quickly predicted in step
210 for the current macroblock from the previously encoded picture. The
predicted $R_i$ for the current macroblock is passed to step 220 to calculate
an appropriate quantizer scale.

In step 220, the method uses a more accurate complexity model to
calculate the quantizer scale which is expressed as:

$$R_i = X_0 + X_1 Q_i^{-1} + X_2 Q_i^{-2} \qquad (5)$$

where $R_i$ is the bits allocated to the macroblock i, $Q_i$ is the quantizer scale
of the macroblock i and $X_0$, $X_1$ and $X_2$ are constants. At the beginning of
the coding process, the constants $X_0$ and $X_2$ are set to zero. This
effectively reduces equation 5 to the initial model of equation 4. Since
there is insufficient data at this early stage of the coding process,
equation 4 is used to acquire a rough estimate of the quantizer scale for
the current macroblock. Namely, the selected quantizer scale should be
suitably an average of the quantizer scales used to code the macroblocks
in the previous picture.
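
One way step 220 could be carried out is sketched below, assuming the complexity model is the second order polynomial in $1/Q_i$ with constants $X_0$, $X_1$ and $X_2$. The function name, the clamping range of 1 to 31 and the rule for choosing between two positive roots are assumptions of this sketch and are not taken from the specification.

```python
import math


def quantizer_scale_from_model(r_target, x0, x1, x2, q_min=1.0, q_max=31.0):
    """Solve r_target = x0 + x1/Q + x2/Q**2 for the quantizer scale Q.

    Falls back to the coarsest scale (q_max) when the model has no usable root.
    """
    if x2 == 0.0:
        # Linear case; with x0 == 0 this is the initial model R = X / Q.
        denom = r_target - x0
        if denom <= 0.0 or x1 <= 0.0:
            return q_max
        q = x1 / denom
    else:
        # Substitute u = 1/Q and solve x2*u**2 + x1*u + (x0 - r_target) = 0.
        disc = x1 * x1 - 4.0 * x2 * (x0 - r_target)
        if disc < 0.0:
            return q_max
        roots = [(-x1 + sign * math.sqrt(disc)) / (2.0 * x2) for sign in (1.0, -1.0)]
        positive = [u for u in roots if u > 0.0]
        if not positive:
            return q_max
        # Assumed tie-break: take the smaller u, i.e. the coarser quantizer.
        q = 1.0 / min(positive)
    return max(q_min, min(q_max, q))


if __name__ == "__main__":
    print(quantizer_scale_from_model(100.0, 0.0, 1000.0, 0.0))     # initial model
    print(quantizer_scale_from_model(120.0, 0.0, 1000.0, 2000.0))  # quadratic model
```

With $X_0$ and $X_2$ at zero the function reduces to $Q_i = X_1 / R_i$, which matches the reduction to equation 4 described above.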

In step 230, the method calculates a modifier, $\gamma$, based on a
constraint that a set of optimal quantization scale ratios be maintained.
This modifier is multiplied to the quantizer scale to produce an optimal
quantizer scale, $Q_{i(optimal)}$, such that a constant visual quality is
maintained throughout the entire picture. The method of generating the
modifier is discussed in detail below with reference to FIG. 3.

In step 240, the method encodes the macroblock i by using the 
optimal quantizer scale calculated from step 230. The encoding method 
produces the actual number of bits needed to encode the macroblock i 
which is passed to step 250. 

In step 250, the method uses the optimal quantizer scale used to
code the macroblock i and the actual number of bits needed to encode the
macroblock i in a polynomial regression model or a quadratic regression
model to refine the complexity model of step 220. Namely, the constants
$X_0$, $X_1$ and $X_2$ are updated to account for the discrepancy between the bits
allocated to the macroblock i and the actual number of bits needed to
code the macroblock for a particular quantizer scale. Regression models
are well known in the art. For a detailed discussion of various
regression models, see e.g., Bowerman and O'Connell, Forecasting and
Time Series, 3rd Edition, Duxbury Press (1993), chapter 4.
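
The refinement of step 250 can be sketched as an ordinary least-squares fit of the constants to the (quantizer scale, actual bits) pairs observed so far. The sketch assumes the model is a second order polynomial in $1/Q$ and uses a batch fit through the normal equations; the specification does not say how the regression is computed, so the function names and the solver are illustrative choices only.

```python
def solve_linear(a, b):
    """Solve a small dense system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: need at least three distinct quantizer scales")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_complexity_model(samples):
    """Least-squares fit of bits = x0 + x1/Q + x2/Q**2.

    samples: iterable of (quantizer_scale, actual_bits) pairs.
    Returns [x0, x1, x2].
    """
    a = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for q, bits in samples:
        u = 1.0 / q
        basis = (1.0, u, u * u)
        for i in range(3):
            b[i] += basis[i] * bits
            for j in range(3):
                a[i][j] += basis[i] * basis[j]
    return solve_linear(a, b)


if __name__ == "__main__":
    observed = [(q, 50.0 + 1000.0 / q + 2000.0 / (q * q)) for q in (2, 4, 5, 8, 10, 16)]
    print(fit_complexity_model(observed))
```

In an encoder the fit would be repeated, or updated incrementally, after each macroblock so that the next quantizer scale is computed from the refreshed constants.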

In step 260, method 200 queries whether there are additional
macroblocks that remain to be coded in the current picture. If the query
is affirmatively answered, method 200 returns to step 220 to calculate a
new quantizer scale for the next macroblock with the updated constants
$X_0$, $X_1$ and $X_2$. If the query is negatively answered, method 200 proceeds
to code the next picture or end.

FIG. 3 illustrates a method (step 230 of FIG. 2) for deriving a
modifier to the quantizer scale based upon the constraint of an optimal
quantization ratio. The method begins at step 305 and proceeds to step
310, where the method calculates a set of human visual system
weightings, $k_1, \ldots, k_N$, in accordance with the formula:

$$k_i = \frac{2 \times Act_i + avg\_Act}{Act_i + 2 \times avg\_Act}, \qquad i = 1, 2, \ldots, N \qquad (6)$$

and

$$Act_i = 1 + \min(Var\_sblk) \qquad (7)$$

where $Act_i$ is a spatial activity measure for the macroblock i. $Act_i$ is
computed using the original pixel values, taking the smallest of the four
(4) luminance frame-organized sub-blocks and the four (4) luminance
field-organized sub-blocks. Var_sblk is expressed as:

$$Var\_sblk = \frac{1}{64} \sum_{k=1}^{64} (P_k - P\_mean)^2 \qquad (8)$$

and

$$P\_mean = \frac{1}{64} \sum_{k=1}^{64} P_k \qquad (9)$$

where $P_k$ are the original pixel values in the original 8 x 8 sub-block.

In the preferred embodiment of step 310, the metric used to
calculate the set of human visual system weightings is the variance
computed over the pixels in the macroblock i. The set of human visual
system weightings for all macroblocks for a picture is calculated prior to
encoding.
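
The weighting computation of step 310 can be sketched as follows. The sketch assumes each macroblock is supplied as a list of eight 64-sample luminance sub-blocks (four frame-organized and four field-organized) and that avg_Act is the mean activity of the macroblocks passed in; the specification does not state over which picture avg_Act is averaged, and the function names are hypothetical.

```python
def block_variance(pixels):
    """Population variance of the samples in one 8 x 8 sub-block."""
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)


def spatial_activity(sub_blocks):
    """Act_i = 1 + the smallest sub-block variance of the macroblock."""
    return 1.0 + min(block_variance(b) for b in sub_blocks)


def hvs_weights(activities):
    """k_i = (2*Act_i + avg_Act) / (Act_i + 2*avg_Act) for every macroblock.

    Assumption: avg_Act is the average of the activities supplied here.
    """
    avg_act = sum(activities) / len(activities)
    return [(2.0 * a + avg_act) / (a + 2.0 * avg_act) for a in activities]


if __name__ == "__main__":
    flat = [[128] * 64] * 8        # a flat macroblock: all variances are zero
    busy = [[0, 255] * 32] * 8     # a highly textured macroblock
    acts = [spatial_activity(flat), spatial_activity(busy)]
    print(acts, hvs_weights(acts))
```

Each weight lies between 0.5 and 2, so a flat macroblock receives a weight below one and a busy macroblock a weight above one, relative to the average activity.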

In step 320, the method sums the set of human visual system
weightings to derive a measure, $K$, which represents the total human
visual system weighting for a picture. Thus, $K$ is expressed as:

$$K = \sum_{i=1}^{N} k_i \qquad (10)$$

In step 330, the method obtains a sum of all the human visual
system weightings up to $k_{i-1}$ which is represented as:

$$K_{i-1} = \sum_{j=1}^{i-1} k_j \qquad (11)$$

In effect, this step computes the sum of all the human visual system
weightings up to the previous macroblock i-1.

In step 340, the projected number of bits $T_p$ for the whole picture is
computed by:

$$T_p = \frac{K}{K_{i-1}} \, B_{i-1} \qquad (12)$$

where $B_{i-1}$ is the sum of all the bits used to code the current frame up to
and including the previous macroblock i-1.

In step 350, the method calculates a modifier or bit activity index
ratio, $\gamma$, by dividing the projected number of bits, $T_p$, by the target number
of bits for the picture, $T$, which is expressed as:

$$\gamma = \frac{T_p}{T} \qquad (13)$$

This modifier is multiplied to the quantizer scale, $Q_i$, calculated in step
220 to produce a $Q_{i(optimal)}$ such that a constant visual quality is
maintained.
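
A minimal sketch of steps 320 through 350 follows. It assumes that the projected picture size is obtained by scaling the bits spent so far by the ratio of the total weighting to the weighting accumulated so far; that projection rule, the function name and the neutral value returned for the first macroblock are assumptions of this sketch.

```python
def quantizer_modifier(weights, i, bits_so_far, target_bits):
    """Return the modifier gamma = projected bits / target bits.

    weights     : human visual system weightings of all macroblocks in the picture
    i           : zero-based index of the macroblock about to be coded
    bits_so_far : bits already spent on macroblocks 0 .. i-1 of this picture
    target_bits : target number of bits T for the picture
    """
    if i == 0 or bits_so_far <= 0:
        return 1.0  # nothing coded yet, leave the quantizer scale unchanged
    k_total = sum(weights)        # total weighting K
    k_prev = sum(weights[:i])     # weighting accumulated before macroblock i
    projected_bits = bits_so_far * k_total / k_prev
    return projected_bits / target_bits


if __name__ == "__main__":
    gamma = quantizer_modifier([1.0, 1.0, 1.0, 1.0], i=2, bits_so_far=600, target_bits=1000)
    q_from_model = 10.0
    print(gamma, gamma * q_from_model)  # the picture is running over budget, so Q is raised
```

A modifier above one means the picture is on course to exceed its target, so the quantizer scale is made coarser; a modifier below one allows a finer scale.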

In the preferred embodiment of the present invention, the
complexity model depicted in step 220 is a second order polynomial.
However, a simulation on a flower garden sequence was conducted to
compare the performance of a linear complexity model, a second order
polynomial complexity model and a third order polynomial complexity
model. In determining the performance of these three methods, a
comparison was made of the fit of the model over the actual data, i.e., the
root mean square error was calculated and compared.
The results are displayed in Table 1 below.

Complexity Model        Root Mean        Improvement in % versus the
                        Square Error     Linear Complexity Model

Linear                  91,702.31        -
2nd Order Polynomial    26,362.81        71.25%
3rd Order Polynomial    21,517.72        76.54%

Table 1

The results demonstrate that the 2nd order polynomial complexity
model produces an improvement of over 71% over the linear model in
predicting the complexity of a picture, thereby improving the overall rate
control of an encoder. Furthermore, the results demonstrate that the
3rd order polynomial complexity model produces an improvement of
over 76% when compared to the linear model. Although the 3rd order
polynomial complexity model produced a better prediction, it also carries
a higher computational overhead. Thus, an encoder designer must
balance between prediction performance and computational overhead in
selecting an appropriate complexity model. In the preferred
embodiment, the 2nd order polynomial complexity model provides an
accurate prediction with a moderate computational overhead.
Furthermore, if computational overhead is an important constraint for a
particular application, then step 230 as depicted in FIG. 3 can be omitted
to simplify the rate control process.
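
The percentages in Table 1 follow directly from the root mean square errors, as the short check below shows. The function name is chosen for this illustration only.

```python
def improvement_percent(baseline_rmse, model_rmse):
    """Relative reduction of the RMSE with respect to the baseline, in percent."""
    return 100.0 * (baseline_rmse - model_rmse) / baseline_rmse


if __name__ == "__main__":
    linear = 91702.31
    for name, rmse in (("2nd order", 26362.81), ("3rd order", 21517.72)):
        print(name, round(improvement_percent(linear, rmse), 2))
```

Moving from the second to the third order model recovers only about five more percentage points, which is the basis for the trade-off against computational overhead discussed above.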
In a second embodiment of the present invention, the actual data
resulting from the encoding process is used directly to compute the
quantizer scale for the next macroblock. This optimization process is
formulated from the following equation:

$$R(D) = F(D) + \lambda E \qquad (14)$$

where $R(D)$ is the total number of bits used to code the picture, $F(D)$ is the
rate distortion function of the current block, $E$ is the target number of
bits to be used in this picture and $\lambda$ is the Lagrange multiplier. In effect,
the Lagrange multiplier process is applied to minimize the rate
distortion function $F(D)$ subject to the constraint of a target bit allocation
$E$ for a picture. This optimization process is discussed below with
reference to FIG. 4.

FIG. 4 depicts a flowchart for a rate control method 400 that uses
the actual data resulting from the encoding process to directly compute
the quantizer scale for the next macroblock. The method begins at step
405 and proceeds to step 410 where the method adopts an initial model
such as TM4 or TM5 to calculate the target bit rate $T_I$, $T_P$ and $T_B$ for an I
frame, P frame and B frame respectively. An alternative model is to
simply assign the target bit rate $T_I$, $T_P$ and $T_B$ from the actual number of
bits necessary to encode previous I, P and B frames.

The method 400 computes, at step 415, a buffer fullness measure 
for each macroblock in the frame as: 



R. =R 0 + B,_ 



(15) 



where: 

R; is the buffer fullness measure before encoding the i-th 
macroblock; 

R 0 is the initial buffer fullness measure; 
Bj j is the number of bits generated by encoding all 
macroblocks up to and including the i-1 macroblocks; 
T is the target bit budget for an I, P or B frame in the 
previous I, P or B frame; and 
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N MB is the total number of macroblocks in the present 
frame. 

The buffer fullness measure is an indicator as to the amount of 
the output buffer that is presently filled with coded bits. This measure 
ensures that the encoder will not underflow or overflow the buffer and 
as a result, lose data. Thus, the method establishes a quantizer scale
that varies depending upon the fullness of the output buffer.

The method then computes, at step 420, the quantizer scale Q_i for
the i-th macroblock as:

Q_i = \frac{31 R_i}{r}    (16)

where:

r = \frac{2 \cdot \text{bit rate}}{\text{frame rate}}    (17)
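
The computation in steps 415 and 420 can be sketched as follows. This is a minimal illustration, not the patented implementation: it reads Equation (15) as R_i = R_0 + B_{i-1} - T(i-1)/N_MB and Equation (16) as Q_i = 31 R_i / r, where the constant 31 is an assumption taken from the TM5 convention. The function names, the clamping to the range 1 to 31 and the sample numbers are hypothetical.

```python
def buffer_fullness(r0, bits_so_far, target_bits, i, n_mb):
    """Equation (15): fullness before coding macroblock i (1-based index)."""
    return r0 + bits_so_far - target_bits * (i - 1) / n_mb


def reaction_parameter(bit_rate, frame_rate):
    """Equation (17): r = 2 * bit rate / frame rate."""
    return 2.0 * bit_rate / frame_rate


def quantizer_scale(fullness, r, q_max=31):
    """Equation (16); clamping the result to [1, q_max] is an assumption."""
    q = fullness * q_max / r
    return max(1, min(q_max, round(q)))


r = reaction_parameter(bit_rate=64_000, frame_rate=30)
# Before macroblock 11 of 99: 1,200 bits spent so far against a 9,900-bit target.
fullness = buffer_fullness(r0=500, bits_so_far=1_200, target_bits=9_900, i=11, n_mb=99)
print(fullness, quantizer_scale(fullness, r))
```

A fuller buffer yields a larger quantizer scale, so the next macroblock is coded more coarsely and produces fewer bits.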

In step 425, the method encodes the i-th macroblock MB_i with the
quantizer scale calculated for the macroblock from step 420. The
resulting encoded signal for the macroblock is passed to step 430.

In step 430, the method calculates the distortion D for the 
macroblock from the encoded signal. The distortion D is the actual 
distortion between the corresponding original macroblock of the input 
picture and the quantized macroblock. The calculated distortion is 
passed to step 435 for comparison. 

In step 435, the method queries whether the distortion has
decreased as compared with a previous calculation. Initially, D is set at
zero, such that the first query always produces a negative response. If
the query is negatively answered, the method proceeds to step 440 where
T is replaced with T-ΔT where ΔT is expressed as:



ΔT = 0.05 · [...]    (18)



The method then returns to step 415 to repeat the process of selecting a
quantizer scale and encoding the macroblock. If the query is positively
answered, the method proceeds to step 450. In effect, the method has
determined from the actual data that the distortion is decreasing as T is
adjusted.

In step 450, the method queries whether the predefined number of 
iterations of adjusting T has been performed. If the query is negatively 
answered, the method proceeds to step 455 where T is again replaced 
with T-ΔT in accordance with equation 18. The method then repeats
until the predefined number of iterations has been satisfied. If the query 
is positively answered, the method proceeds to step 465. In the preferred 
embodiment, T is adjusted twenty (20) times. However, the number of 
iterations can be adjusted to accommodate other factors such as speed, 
computational overhead and distortion. 

In step 465, the method selects the T that produces the smallest 
distortion. This T will be used in step 415 for the next macroblock.

In step 470, the method increments i by one. In step 475, the 
method queries whether there are additional macroblocks. If the query 
is positively answered, the method proceeds to step 415 and the whole 
method is repeated for the next macroblock. If the query is negatively 
answered, the method proceeds to step 480 where the method will end or 
proceed to the next picture or frame. 
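
The search over T described for method 400 can be sketched in simplified form. This sketch collapses steps 415 to 465 into a single loop: it lowers T by a fixed step a set number of times and keeps the T that gave the smallest distortion. The callback `distortion_for` is hypothetical and stands in for encoding the macroblock with the quantizer scale derived from T and measuring the resulting distortion. The step size is passed in as a parameter rather than computed from Equation (18).

```python
def select_target(t_initial, delta_t, distortion_for, iterations=20):
    """Try T, T - dT, T - 2*dT, ... and return the T with the smallest distortion."""
    best_t, best_d = t_initial, distortion_for(t_initial)
    t = t_initial
    for _ in range(iterations):
        t -= delta_t
        d = distortion_for(t)
        if d < best_d:
            best_t, best_d = t, d
    return best_t, best_d


# Toy distortion curve with its minimum at T = 9000 (purely illustrative).
toy = lambda t: (t - 9_000) ** 2 / 1e6
print(select_target(10_000, 100, toy))
```

The default of twenty iterations follows the preferred embodiment; fewer iterations trade distortion for speed.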

In a third embodiment, the projected number of bits T_P as
disclosed in step 340 of FIG. 3 can be calculated using the coding
information of the previous frame or picture. More specifically, since
successive frames are often closely correlated, the number of bits used to
code the previous frame is used to derive the projected number of bits
T_P(n) for the n-th frame.

FIG. 5 illustrates a flowchart of a method 500 for calculating the
projected number of bits T_P(n) for the n-th frame in accordance with the
number of bits used to code the previous frame and the overall bit rate of
a channel (or the bit budget for a group of pictures (GOP)). Although
method 500 can be applied to all picture types, it is particularly well suited
for predicting the number of bits for a P picture. However, those skilled
in the art will realize that method 500 can be adjusted to improve the
prediction of I and B pictures.
Referring to FIG. 5, the method 500 begins at step 510 and
proceeds to step 520 where method 500 computes T_P(AVG), where T_P(AVG) is
expressed as:

T_P(AVG) = Max(bitrate / frame rate, R / N)    (19)

where T_P(AVG) is the projected average number of bits needed to code a
remaining frame, R is the remaining number of bits and N is the
remaining number of frames. Namely, in step 520, method 500 derives
the projected average number of bits needed to code a remaining frame
by taking the greater of the division of the bitrate by the frame
rate or the division of the remaining number of bits in a GOP (the
remainder of the bit budget for a GOP) by the remaining number of
frames in the GOP. Equation (19) [...] change in the channel bitrate, which
will significantly affect the bit [...] per [...]. Finally, the frame
rate is [...].

However, the calculation of T_P(AVG) does not account for the close
correlation of the content in successive frames. Namely, it is content
independent and distributes the available bits equally to the remaining
frames.

In step 530, method 500 computes the projected number of bits
T_P(n) for the n-th frame from T_P(AVG), where T_P(n) is expressed as:

T_P(n) = T_P(AVG) * (1 - w) + B(n-1) * w    (20)

where B(n-1) is the number of bits used to code the previous frame and w
is a weighting factor which is set at 0.05. Thus, the projected number of
bits T_P(n) comprises a component which takes into account the
number of bits used to code the previous frame, thereby improving the
projection for the number of bits needed to code a frame. In turn, T_P(n)
can be used as discussed above in FIGs. 2-4 to alter the quantizer scale to
effect an efficient rate control. Finally, method 500 ends in step 540.
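
Steps 520 and 530 of method 500 reduce to two short formulas, sketched below under stated assumptions. The function names and sample values are hypothetical, and the default w = 0.05 is read from a partly illegible passage of the source, so it should be treated as an assumption.

```python
def projected_average_bits(bit_rate, frame_rate, remaining_bits, remaining_frames):
    """Equation (19): the greater of bitrate/frame rate and R/N."""
    return max(bit_rate / frame_rate, remaining_bits / remaining_frames)


def projected_bits(t_avg, previous_frame_bits, w=0.05):
    """Equation (20): blend the average projection with the previous frame's bits."""
    return t_avg * (1.0 - w) + previous_frame_bits * w


# 64 kbps at 30 frames/s, with 200,000 bits left for 100 remaining frames.
t_avg = projected_average_bits(64_000, 30, 200_000, 100)
# The previous frame took 3,000 bits, so the projection is pulled slightly upward.
t_n = projected_bits(t_avg, previous_frame_bits=3_000)
print(round(t_avg, 1), round(t_n, 1))
```

A larger w makes the projection follow the previous frame more closely, which suits sequences whose content changes slowly.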
Furthermore, those skilled in the art will realize that method 500 can
be implemented by evaluating the number of bits spent versus the
number of bits remaining. In addition, the weighting factor w can be
adjusted to other values to accommodate other applications or adjusted
in response to the content within the GOP.

Finally, Appendix A is enclosed to demonstrate the effectiveness of 
the rate control method illustrated in method 500 as compared with the 
proposed verification models (VMs) of the upcoming MPEG 4 standard. 

FIG. 6 depicts an encoder 600 that incorporates a fourth
embodiment of the present invention. The encoder contains a block
motion compensator (BMC) and motion vector coder 604, subtracter 602,
discrete wavelet transform (DWT) coder 606, bit rate controller 610, DWT
decoder 612 and output buffer 614.

In general, the input signal is a video image (a two-dimensional
array of pixels (pels) defining a frame in a video sequence). To
accurately transmit the image through a low bit rate channel, the 
spatial and temporal redundancy in the video frame sequence must be 
substantially reduced. This is generally accomplished by coding and 
transmitting only the differences between successive frames. The 
encoder has three functions: first, it produces, using the BMC and its 
coder 604, a plurality of motion vectors that represent motion that occurs 
between frames; second, it predicts the present frame using a 
reconstructed version of the previous frame combined with the motion 
vectors; and third, the predicted frame is subtracted from the present 
frame to produce a frame of residuals that are coded and transmitted 
along with the motion vectors to a receiver. Within the receiver, a 
decoder reconstructs each video frame using the coded residuals and 
motion vectors. A wavelet-based video encoder having the general
structure of that depicted in FIG. 6 is disclosed in U.S. provisional patent
application serial number 60/007,012, filed October 25, 1995, Attorney
Docket Number 11908 (converted into US patent application number
____, filed ____, Attorney Docket Number DSRC 11908) and
U.S. provisional patent application serial number 60/007,013, filed
October 25, 1995, Attorney Docket Number 11730 (converted into US
patent application number ____, filed October 23, 1996, Attorney
Docket Number DSRC 11730), both of which are incorporated herein by
reference. Both these applications discuss the use of wavelet transforms
to encode video signals.

WO 97/16029    PCT/US96/17204

This disclosure focuses on a technique for controlling the coding
rate of the wavelet encoder. The general function of the encoder to
produce wavelets from video sequences does not form any part of this
invention and is only depicted in FIG. 6 and discussed below to place the
invention within a practical context.

The discrete wavelet transform performs a wavelet hierarchical
subband decomposition to produce a conventional wavelet tree
representation of the input image. To accomplish such image
decomposition, the image is decomposed using times two subsampling
into high horizontal-high vertical (HH), high horizontal-low vertical
(HL), low horizontal-high vertical (LH), and low horizontal-low vertical
(LL) subbands. The LL subband is then further subsampled
times two to produce a set of HH, HL, LH and LL subbands. This
subsampling is accomplished recursively to produce an array of
subbands such as that illustrated in FIG. 7 where three subsamplings
have been used. Preferably six subsamplings are used in practice. The
parent-child dependencies between subbands are illustrated as arrows
pointing from the subband of the parent nodes to the subbands of the
child nodes. The lowest frequency subband is the top left LL, and the
highest frequency subband is at the bottom right HH_3. In this example,
all child nodes have one parent. A detailed discussion of subband
decomposition is presented in J.M. Shapiro, "Embedded Image Coding
Using Zerotrees of Wavelet Coefficients", IEEE Trans. on Signal
Processing, Vol. 41, No. 12, pp. 3445-62, December 1993.

The DWT coder of FIG. 6 codes the coefficients of the wavelet tree
in either a "breadth first" or "depth first" pattern. A breadth first pattern
traverses the wavelet tree in a bit-plane by bit-plane pattern, i.e., quantize
all parent nodes, then all children, then all grandchildren and so on. In
contrast, a depth first pattern traverses each tree from the root in the
low-low subband (LL) through the children (top down) or children
through the low-low subband (bottom up).
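
The two traversal orders can be illustrated on a toy forest of trees. This is a simplified sketch: the `Node` class, the node names and the two-children-per-node shape are hypothetical, and real wavelet trees have a different branching structure. The depth first version shown is the top down variant.

```python
from collections import deque


class Node:
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


def breadth_first(roots):
    """Visit all parents, then all children, then all grandchildren."""
    order, queue = [], deque(roots)
    while queue:
        node = queue.popleft()
        order.append(node.name)
        queue.extend(node.children)
    return order


def depth_first(roots):
    """Visit each tree completely, from its root down, before the next tree."""
    order = []

    def visit(node):
        order.append(node.name)
        for child in node.children:
            visit(child)

    for root in roots:
        visit(root)
    return order


def make_tree(tag):
    return Node(f"{tag}-parent", [
        Node(f"{tag}-child1", [Node(f"{tag}-grandchild1")]),
        Node(f"{tag}-child2", [Node(f"{tag}-grandchild2")]),
    ])


trees = [make_tree("A"), make_tree("B")]
print(breadth_first(trees))
print(depth_first(trees))
```

Breadth first ordering groups coefficients by level across all trees, while depth first ordering finishes one tree before starting the next, which is what allows a bit budget to be assigned per tree.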

FIG. 8 depicts a detailed block diagram of the rate controller 610 
and its interconnection with the DWT coder 606. The DWT coder 
contains a DWT 802 connected in series with a quantizer 804 which, in 
turn, is connected in series with an entropy coder 806. The output of the 
quantizer is also connected to the DWT decoder 612. The output signal 
from the entropy coder is connected to the output buffer 614. The input to 
the DWT coder is typically a sequence of frames containing motion 
compensated residuals. However, generally speaking the input 
sequence can be a series of frames containing any two-dimensional data. 
The specific nature of this data within the frames is irrelevant to the 
operation of the invention. 

The quantizer 804 is used to quantize the coefficients of the wavelet 
transform. The inventive rate controller 610 controls the quantizer scale 
(step size) depending upon a number of parameters such that a 
predefined bit budget for a predefined series of frames is not exceeded 
during the coding process. Based upon a statistical analysis of a frame 
(arbitrarily, the first frame) in a sequence of video frames, the invention 
generates a bit budget for the next frame (a second frame). This 
statistical analysis is performed upon the frames prior to 
transformation; therefore, it is said to be accomplished at the frame 
layer. Processing accomplished after transformation is said to occur in 
the wavelet tree layer. The frame layer bit budget is allocated to each 
tree extending from the low-low subband. Allocation of a certain 
number of bits per tree is accomplished according to the number of bits 
already consumed in coding previous frames within the sequence, 
coding complexity of the present frame and buffer fullness information 
provided by the output buffer. The quantization parameter for each 
coefficient in a tree is computed based upon the bit allocation for its tree. 

Assume the input video sequence to the encoder contains a series 
of frames having two types: intra frames (I-frames) and predictive 
frames (P-frames). Also, assume that an I-frame occurs in the 
sequence every F P-frames such that, for example, the sequence is: 
. . . IPPPPPP . . . PPPIPP . . .

The total number of bits G necessary to code a Group of Frames 
spanning from an I-frame to the next I-frame is: 



G = F \cdot \frac{\text{bit rate}}{\text{frame rate}}    (21)

Thus, for an encoder within a system that operates at 64 kbps,
with an I-frame transmitted every 120 frames and frames
transmitted at 30 frames per second, the total number of bits necessary
to code all the frames in the group is 256 kbits, i.e., the bit budget for the
sequence is 256 kbits. With so few bits to work with, it is apparent that
a rate control technique that optimally allocates bit budget to the various 
frames is desirable. 
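
The 256 kbit figure can be checked with a short calculation. The sketch reads Equation (21) as G = F × bit rate / frame rate, which is the form that reproduces the numbers quoted in the text; the function name is hypothetical.

```python
def group_bit_budget(frames_per_group, bit_rate, frame_rate):
    """Equation (21): bits available for one group of frames."""
    return frames_per_group * bit_rate / frame_rate


g = group_bit_budget(frames_per_group=120, bit_rate=64_000, frame_rate=30)
print(g)  # 256000.0
```

On average this leaves roughly 2,133 bits per frame, which is why the allocation between I-frames and P-frames matters so much at this rate.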

To accomplish the optimal allocation, the rate controller 610 
contains a frame layer bit allocator 808, a wavelet tree layer bit 
allocator 810, and a quantizer parameter mask generator 812. The
frame layer allocator 808 is connected to the wavelet tree layer 
allocator 810 and both allocators are connected to the quantizer 
parameter mask generator 812. The mask generator produces a two 
dimensional array of quantizer parameters. This array is used to alter a 
nominal quantizer scale value such that an optimal bit rate is produced 
by the DWT coder 606. The operation of the frame layer bit allocator is 
discussed with respect to FIG. 9 and the operation of the wavelet tree 
layer bit allocator is discussed with reference to FIG. 10. 

A. Allocating Bits At The Frame Layer 

FIG. 9 depicts a flow chart of the process 900 by which the frame
layer bit allocator operates. The process begins at step 902 and continues
with step 904. Prior to coding the first I-frame, the rate control process 
sets, at step 904, a variable R, representing the number of bits remaining 
to code the group of frames, equal to the total bit budget G. The process 
then establishes, at step 906, a target bit rate for the first I-frame in 
accordance with the following equation: 
T_I = \frac{R}{1 + \frac{N_p X_p K_I}{X_I K_p}}, with T_I + N_p T_p = R initially;    (22)


where:

T_I is the target bit rate for the first I-frame;
N_p = F-1 is the number of P-frames in the sequence;
T_p is the average bit budget established for the P-frames;
R is the remaining bits available for assignment;
X_p is a complexity measure for a given P-frame;
X_I is a complexity measure for a given I-frame;
K_I is a weighting coefficient for an I-frame; and
K_p is a weighting coefficient for a P-frame.

The values of X_p and X_I are initially set as a function of the desired bit
rate as follows:

X_I = \frac{160 \cdot \text{bit rate}}{115}; and
X_p = \frac{60 \cdot \text{bit rate}}{115}.    (23)

Thereafter, each iteration of the process generates updated values of X_I
and X_p. The complexity measures X_p and X_I are updated as the frames
are coded. In the simplest form, the complexity measures are updated by
multiplying an average of the quantization parameters generated for the
previous frame by the number of bits used to code the previous picture.
This simple method of establishing an initial complexity value and
updating that value is disclosed in International Organization for
Standardization, Coded Representation of Picture And Audio: Test
Model 5, ISO-IEC/JTC1/SC29/WG11, Version 1, pp. 61-64, April 1993. A
better measure of complexity is disclosed above in FIGs. 2-3. Either
method of computing the complexity measures is applicable to this
invention.
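
The simple complexity model described above can be sketched as follows. The initial values follow Equation (23), read as 160 × bit rate / 115 for I-frames and 60 × bit rate / 115 for P-frames, and the update is the product of the average quantization parameter and the bits spent on the previous frame. Function names and sample values are hypothetical.

```python
def initial_complexities(bit_rate):
    """Equation (23): starting complexity measures for I- and P-frames."""
    x_i = 160 * bit_rate / 115
    x_p = 60 * bit_rate / 115
    return x_i, x_p


def updated_complexity(average_quantizer, bits_used):
    """Simplest update: average quantization parameter times bits spent."""
    return average_quantizer * bits_used


x_i0, x_p0 = initial_complexities(64_000)
# After coding a P-frame with an average quantizer of 12.5 using 2,000 bits:
x_p1 = updated_complexity(average_quantizer=12.5, bits_used=2_000)
print(round(x_i0, 1), round(x_p0, 1), x_p1)
```

The product captures the idea that a frame needing many bits even at a coarse quantizer is a complex frame.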
The values of K_I and K_p are "universal constants" that are defined
by the quantization scale value QT (described below). Generally QT is not
a single value, but rather is a matrix of values that establishes a
nominal quantization scale for each tree in a frame. A typical value for
weighting functions K_I and K_p is 1.4 and 1.0, respectively.

The process updates, at step 908, the value of R after coding is 
complete for an I-frame: 



R = R - T_I    (24)



The process establishes, at step 910, the target bit rate for the n-th frame
in the group of frames, a P-frame, as:

T_p^n = \frac{R}{N_p}    (25)

where:

T_p^n is the target bit rate for the n-th frame.

At step 912, the process computes the bit allocation for each of the 
wavelet trees contained in the present (n-th) frame. This is
accomplished by executing the wavelet tree layer bit allocation process of
FIG. 10. This process is discussed in detail below.

At step 914, the process queries whether all of the frames in the 
sequence have been processed, i.e., n=F. If the query is affirmatively 
answered, all the frames have been processed, i.e., one I-frame and (F-1)
P-frames and the process returns to step 904 to process the next group of
frames. If the query is negatively answered, the process proceeds to
step 916.

After coding each frame, the variables of Equation 21 are updated,
at step 916, as follows:

N_p = N_p - 1    (26)

R = R - B_{NT}^n    (27)

n = n + 1    (28)

where:

B_{NT}^n is the actual number of bits used to code the n-th frame;
and

NT is the total number of wavelet trees representing each
frame.

At this point, the process has computed a bit budget for the next frame 
(n-th frame) that will be coded by the DWT coder. Next, the process must 
allocate the frame layer bit budget to each tree comprising the n-th 
frame. 
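
The P-frame part of the frame layer loop can be sketched as below. The sketch reads Equation (25) as T_p^n = R / N_p and applies the updates of Equations (26) and (27) after each frame. The callback `code_frame` is hypothetical and stands in for the encoder: it receives the target and returns the bits actually used. The I-frame cost and the per-frame cost are made-up numbers.

```python
def allocate_p_frames(remaining_bits, n_p, code_frame):
    """Walk the P-frames of one group, applying Equations (25) to (27)."""
    targets = []
    while n_p > 0:
        target = remaining_bits / n_p   # Eq. (25): share the remainder equally
        used = code_frame(target)       # hypothetical encoder call
        targets.append(target)
        n_p -= 1                        # Eq. (26)
        remaining_bits -= used          # Eq. (27)
    return targets, remaining_bits


# Group budget of 256,000 bits; assume the I-frame consumed 56,000 (Eq. 24).
r_after_i = 256_000 - 56_000
# Toy encoder that always spends 40,000 bits regardless of the target.
targets, left = allocate_p_frames(r_after_i, n_p=4, code_frame=lambda t: 40_000)
print(targets, left)
```

Because each target is recomputed from what actually remains, bits saved on one frame are automatically redistributed to the frames that follow.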

B. Allocating Bits To Wavelet Trees 

FIG. 10 depicts a flow chart of the wavelet tree layer bit allocation 
process 1000 representing the operation of the wavelet tree layer bit 
allocator. The process 1000 is entered from the process 900 at step 1002. 
As discussed above, each frame is represented by a plurality of wavelet 
trees extending from the low-low band of the decomposed input frame. 
Consequently, the coding bits allocated to the n-th frame must be 
allocated to the trees j. The process 1000 computes, at step 1004, a buffer 
fullness measure for each tree in the frame as: 

R_j^n = R_0^n + B_j^n - \frac{T^n (j-1)}{NT}    (29)

where: 

R_j^n is the buffer fullness measure before encoding the j-th
tree;
R_0^n is the initial buffer fullness measure;
B_j^n is the number of bits generated by encoding all wavelet
trees in the n-th frame up to and including the j-th tree;
T^n is the target bit budget in the previous I or P frame (i.e.,
this is the approximate number of bits that will become free
in the buffer when the previous frame is extracted from the
buffer); and
NT is the total number of wavelet trees in the present frame.
The buffer fullness measure R_j^n is an indicator as to the amount of
the output buffer that is presently filled with coded bits. This measure
ensures that the encoder will not underflow or overflow the buffer and,
as a result, lose data. Thus, the process establishes a quantization scale
that varies depending upon the fullness of the output buffer.

The process then computes, at step 1006, the quantization
parameter Q_j^n for the j-th wavelet tree as:

Q_j^n = \frac{31 R_j^n}{r}    (30)

where:

r = \frac{2 \cdot \text{bit rate}}{\text{frame rate}}    (31)



The quantization parameter is stored in an array of such parameters.
This array, as is discussed below, forms a mask that can be used to
optimally quantize the wavelet coefficients within each tree.

At step 1008, the process queries whether all the trees in the
present frame have been processed, i.e., whether j=NT. If the query is
affirmatively answered, the process increases, at step 1010, the tree
index j by one, then returns to step 1004 to compute the buffer fullness
measure for the next tree in the frame. The process proceeds through
this loop until all the trees in a frame are processed by iterating for all j
values until j=NT. If the query at step 1008 is negatively answered, the
process proceeds along the NO path to step 1012. At step 1012, the final
buffer fullness value is used to update the initial buffer fullness measure
such that the buffer fullness measure when j=NT for the n-th frame is
used as the initial buffer fullness measure for the (n+1)-th frame. As such,

R_0^{n+1} = R_{NT}^n    (32)

Once complete, the process has computed a bit allocation for each tree in the
frame and returns, at step 1014, to the frame layer bit allocation 
process 900. At this point, the rate controller has generated a 
quantization mask for the present frame. As such, the present frame 
may now be quantized and coded. 
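
The tree layer loop can be sketched as follows. Several points are assumptions rather than statements of the source: the fullness measure mirrors the macroblock form of Equation (15), counting only the bits of trees coded before tree j; the constant 31 follows the TM5 convention; and the per-tree bit counts are supplied as a list, whereas a real encoder would produce them as each tree is coded. All names and numbers are hypothetical.

```python
def tree_layer_mask(r0, target_bits, bits_per_tree, bit_rate, frame_rate, q_max=31):
    """Return one quantization parameter per tree and the last fullness value."""
    r = 2.0 * bit_rate / frame_rate                 # reaction parameter
    n_t = len(bits_per_tree)
    mask, generated, fullness = [], 0, r0
    for j in range(1, n_t + 1):
        # Fullness before coding tree j (assumed analogue of Equation (15)).
        fullness = r0 + generated - target_bits * (j - 1) / n_t
        mask.append(fullness * q_max / r)           # quantization parameter for tree j
        generated += bits_per_tree[j - 1]           # bits produced by coding tree j
    # The fullness at j = NT seeds the next frame, as the text describes.
    return mask, fullness


mask, next_r0 = tree_layer_mask(
    r0=500, target_bits=8_000, bits_per_tree=[2_100, 1_900, 2_200, 1_800],
    bit_rate=64_000, frame_rate=30,
)
print([round(q, 2) for q in mask], next_r0)
```

Trees that follow an overspend receive a larger parameter and are quantized more coarsely, which pulls the frame back toward its budget.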
C. Adaptive Quantization

The quantization parameter mask is used to establish a 
quantization step size (quantizer scale) that will result in the target bit 
rate for each frame. The quantization step (m_quant) is computed as: 

m_quant = Q_j^n * QT    (33)

where: 

QT is a nominal quantization scale value that may be 
constant for the entire sequence of frames; it may vary from 
frame to frame, but be constant within each frame; or it 
may vary within each frame. 

As such, each of the values in Equation 33 may be matrix quantities. 
The quantization parameters, in effect, alter the value of the nominal 
quantization scale to ensure that the bit budget is maintained and that 
the bit rate at the output of the wavelet-based video encoder is 
substantially constant. 
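
Applying the mask can be sketched as an element-wise product. This reads Equation (33) as m_quant = Q × QT, with Q taken from the quantization parameter mask, and treats both quantities as two-dimensional arrays of equal shape; the element-wise interpretation, the function name and the sample values are assumptions.

```python
def quantization_steps(mask, nominal_scale):
    """Element-wise product of the parameter mask and the nominal scale QT."""
    return [[q * qt for q, qt in zip(mask_row, qt_row)]
            for mask_row, qt_row in zip(mask, nominal_scale)]


mask = [[0.5, 1.0], [1.5, 2.0]]   # quantization parameters, one per tree
qt = [[8, 8], [16, 16]]           # nominal quantization scale per tree
print(quantization_steps(mask, qt))  # [[4.0, 8.0], [24.0, 32.0]]
```

A parameter below one refines the nominal step size for a tree, and a parameter above one coarsens it.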

There has thus been shown and described a novel apparatus and 
method that recursively adjusts the quantizer scale for each macroblock 
to maintain the overall quality of the video image while optimizing the 
coding rate. Many changes, modifications, variations and other uses 
and applications of the subject invention will, however, become apparent 
to those skilled in the art after considering this specification and the 
accompanying drawings which disclose the embodiments thereof. All 
such changes, modifications, variations and other uses and applications 
which do not depart from the spirit and scope of the invention are 
deemed to be covered by the invention, which is to be limited only by the 
claims which follow. 
3. Experimental Results

These are the coding results for the [...] sequences. [...] 24, 48 kbps
respectively [...] the target rate with this set of quantization [...]
rate is reached. The Q2 approach [...]

3.1 Class A sequence

[Figure panels: Mother and Daughter; Hall Monitor]
What is claimed is:

1. Apparatus for encoding an input image sequence having at 
least one input frame, where said frame is partitioned into at least one 
block, said apparatus comprising: 

a block motion compensator for computing a motion vector for the
block and for generating a predicted image using said motion vector;

a transform module, coupled to said block motion compensator,
for applying a transformation to a difference signal between the input
frame and said predicted image, where said transformation produces a
plurality of coefficients;

a quantizer, coupled to said transform module, for quantizing said 
plurality of coefficients with a quantizer scale; 

a controller, coupled to said quantizer, for selectively adjusting 
said quantizer scale for a current frame in response to coding 
information from an immediate previous encoded portion; and 

a coder, coupled to said quantizer, for coding said plurality of 
quantized coefficients. 



2. The apparatus of claim 1, wherein said immediate previous 
encoded portion is an encoded frame and wherein said coding 
information from said immediate previous encoded portion is used to 
determine a projected number of bits for a frame n in the image
sequence. 

3. The apparatus of claim 1, where said transform module 
applies a wavelet transform to produce a plurality of wavelet trees.

4. The apparatus of claim 3, wherein said coding information 
from said immediate previous encoded frame is used to determine a 
buffer fullness measure before encoding a j-th wavelet tree from said 
plurality of wavelet trees. 
5. The apparatus of claim 1, wherein said immediate previous
encoded portion is an encoded macroblock and wherein said coding 
information from said immediate previous encoded portion is used to 
adjust a complexity model. 


6. Method for encoding an input image sequence having at least 
one input frame, where said frame is partitioned into at least one block, 
said method comprising the steps of: 

computing a motion vector for the block; 
generating a predicted image using said motion vector;

applying a transformation to a difference signal between the input 
frame and said predicted image, where said transformation produces a 
plurality of coefficients; 

quantizing said plurality of coefficients with a quantizer scale; 
selectively adjusting said quantizer scale for a current frame in
response to coding information from an immediate previous encoded
portion; and

coding said plurality of quantized coefficients. 

7. The method of claim 6, where said transformation applying
step applies a wavelet transform to produce a plurality of wavelet trees.

8. The method of claim 7, wherein said coding information from 
said immediate previous encoded frame is used to determine a buffer
fullness measure before encoding a j-th wavelet tree from said plurality
of wavelet trees.

9. The method of claim 6, wherein said immediate previous 
encoded portion is an encoded macroblock and wherein said coding
information from said immediate previous encoded portion is used to
adjust a complexity model.

10. Method for selecting an optimal quantizer scale for encoding 
an input image sequence having at least one input frame, where said
frame is partitioned into at least one block, said method comprising the
steps of:

estimating a bit allocation for encoding the input frame; and 
selecting a quantizer scale for a current frame in response to 
coding information from an immediate previous encoded portion. 